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Abstract 

An integral equation based scheme is presented for the fast and accurate 
computation of effective conductivities of two-component checkerboard-like 
composites with complicated unit cells at very high contrast ratios. The 
scheme extends recent work on multi-component checkerboards at medium 
contrast ratios. General improvement include the simplification of a long- 
range preconditioner, the use of a banded solver, and a more efficient place- 
ment of quadrature points. This, together with a reduction in the number 
of unknowns, allows for a substantial increase in achievable accuracy as well 
as in tractable system size. Results, accurate to at least nine digits, are 
obtained for random checkerboards with over a million squares in the unit 
cell at contrast ratio 10 6 . Furthermore, the scheme is flexible enough to 
handle complex valued conductivities and, using a homotopy method, purely 
negative contrast ratios. Examples of the accurate computation of resonant 
spectra are given. 

Keywords: Random checkerboard, Homogenization, Integral equation, 
Fast solver, Metamaterial 



1. Introduction 

This paper is devoted to solving the electrostatic equation for periodic 
composites with unit cells made of squares of conductivity a-i that are ei- 
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ther mixed with other squares of conductivity 01, to form a checkerboard 
structure, or simply embedded in a background material of conductivity o\. 
There are N sq squares in the unit cell and the area fraction of squares with 
conductivity 2 is denoted p. The goal is to compute the effective conduc- 
tivity 0* rapidly, with high accuracy, and for almost any combination of 01, 
cr 2 , and p for which the electrostatic equation has a solution. 

1.1. Motivation and challenges 

There are several applications that motivate our study. The homogeniza- 
tion of checkerboards with random unit cells at high real valued contrast ra- 
tios 02/01 (or 01/02) is a classic problem in materials science. It is of interest 
to study how 0* depends on p and in particular what happens when one type 
of squares forms a connected path throughout the composite (percolation). 
The contrast ratio can be considerable in materials of technological impor- 



tance. A ratio of 10 is not unusual |23(. Very large N sq are then needed to 



reach convergence to statistical limits. See Chapters 10.10 and 10.11 of [25 
for a review of this field. In the metamaterial community there is a strong 
interest in a related issue, namely how to compute resonant spectra of effec- 
tive dielectric permittivity functions (spectra of plasmonic excitations) for 
composites made of polygonal metamaterial inclusions embedded in a dielec- 
tric background material 0, |27| • The electrostatic equation is the same for 
conducting and for dielectric materials. Only the notation differs, see Ta- 



ble on p. 19 of 25]. For simplicity, we will talk about conductivity in this 
context, too. Of particular interest is the behavior of 0* close to values of 
02/01 where the electrostatic equation does not have a solution or only has 
a solution as a limit in the complex 02/01-plane. 

The computational tasks just discussed offer extreme challenges. Non- 
smooth interfaces tend to make solutions singular and hard to resolve. The 
electric fields close to certain corner vertices may just barely be square inte- 
grable. As N sq grows, the interaction between distantly separated parts in 
the computational domain may cause problems which cannot be resolved by 
discretization and local techniques alone. Being in the vicinity of parame- 
ter combinations where the electrostatic equation ceases to have a solution 
is often hard. All these difficulties add up and may manifest themselves 
as artificial ill-conditioning, slow convergence with mesh refinement, criti- 
cal slowing down in iterative solvers, and severe loss of precision. Several 
methods have been suggested to alleviate these problems including variants 
of the finite element method [H, @], network models 23, 12], renormalization 
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schemes 
tion (H 



201 ]. mode-matching methods 27j], and Brownian motion simula- 



See also Section 3 of 26] for state-of-the-art algorithms to combat 
critical slowing down in network models and {§] for a discussion of future 
directions in the research field at large. 

1.2. Our scheme 

Let T denote the boundary (the interfaces) of a composite. We shall 
reformulate the electrostatic equation as a Fredholm second kind integral 
equation 

(I + K)»(z) = g(z), zeT, (1) 

where I is the identity, K is an integral operator which is compact on smooth 
T, n(z) is an unknown layer density, and g(z) is a right hand side. 

Solvers for large-scale boundary values problems on smooth domains often 
rely on integral equation reformulations of the form ([1]). The last few years 
have seen increased activity in the development of efficient solvers using ([T]) 
also when T is non-smooth. The scheme of the present paper originates from 



work on non-smooth inclusion problems in free-space [17|]. The ideas in 



were later improved and extended to encompass the biharmonic equation [18] , 



mixed boundary conditions [13j], singular integral equations with non-zero 



indices [14(, and boundaries with quadruple-junctions [15] . The present paper 



is a direct sequel to [15j. As in [15], we apply a combination of short- and 



long-range preconditioners to (1T1). Major new features include: 

• A better strategy for choosing quadrature nodes which makes the error 
in a* grow linearly with contrast ratio. In 15J the growth is superlinear. 



An improved long-range preconditioner which makes the computational 
cost grow almost linearly with N sq . In 15] the growth is cubic. 



• A homotopy-type method which allows for computing a* at points in 
the complex a 2 /o"i-plane where the solution to the electrostatic equa- 
tion only exists as a non- unique limit. 

In addition there are several minor improvements. 

1.3. Relation to the Bremer-Rokhlin scheme 

Other recent work on the efficient solution of ([1]) in the presence of 
non-smooth boundaries includes [5J, which exploits cancellation of singulari- 
ties, and a comprehensive mechanism currently being developed by a group 
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around Bremer and Rokhlin 0, E|- Let us discuss the relation of the 
Bremer-Rokhlin scheme to our scheme. 

Both schemes take as a starting point the observation that an accurate 
and economical discretization of ([1]) can only be effected by restricting the 
operator on the left hand side to a finite-dimensional subspace, determined by 
the right hand side g{z). High resolution in combination with compression 
is used as a means to achieve this. The result is a kind of precomputed 
purpose-made composite quadrature, suitable for Nystrom discretization. 

The Bremer-Rokhlin scheme employs an elaborate machinery to con- 
struct families of 'universal quadratures'. Each universal quadrature is ap- 
propriate for the discretization of a given integral equation over an entire 
class of boundary segments with complicated geometry. When the integral 
equation depends on material parameters, in addition to geometry, more 
universal quadratures are needed. The approach has the advantage that the 
precomputation is done once and for all and can be stored on disk. When 
solving a particular problem involving many boundary singularities of similar 
shapes, only a few universal quadratures need to be activated. Our scheme 
precomputes 'quadrature- weighted inverses' afresh around every boundary 
singularity. This offers greater flexibility when applying the scheme to new 
situations and opens up for a parallel implementation, but requires more 
RAM storage. 

Another difference between the two schemes is the way in which the pro- 
cess of resolution and compression is carried out. The Bremer-Rokhlin com- 
pression is done via a series of solutions of large linear systems followed by 
rank-revealing QR decompositions. Our scheme deals with resolution and 
compression in tandem, using a fast and stable recursion. No large linear 
systems are ever set up. This is an advantage for boundary segments where 
extremely high resolution is needed. 

The schemes also differ in the assumptions made on g(z). The Bremer- 
Rokhlin scheme assumes that g(z) is a restriction to T of a function that 
satisfies the underlying partial differential equation in a neighborhood of 
each point on V. This assumption applies, for example, to certain important 
acoustic scattering problems. Our scheme only assumes that g(z) is piece- 
wise smooth. This is an advantage when the computational domain models 
granular materials or materials containing branching cracks. 

An open question is how easily the two schemes generalize to three di- 
mensions. Perhaps one can combine their best features? 
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l.Jf.. Organization of 'the paper 

The paper is divided into eight sections. Section [2] introduces unit cells 
and integral equations that will be used in all examples. Section [3] is on 
discretization. The leading ideas in our compression scheme are summa- 
rized in Sections H] and |5j Section [6] is on implementation. This material is 
essential for the understanding of how limits are taken in the complex 0-2/0-1- 
plane and how the compression of inverses of giant matrices corresponding 
to intensely resolved integral operators can be executed in sub-linear time. 
Section [7] presents improvements to the long-range preconditioner proposed 
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15] . The paper ends in Section with some truly large-scale and accurate 
numerical examples for random checkerboards along with the computation 
of resonant spectra of two metamaterial composites. The reader interested 



in more examples is referred to a forthcoming paper |16| . 

2. Integral equations for the electrostatic problem 

We shall solve the electrostatic partial differential equation on three types 
of doubly-periodic domains in a plane D: square arrays of squares, staggered 
arrays of squares, and two-component random checkerboards. An average 
electric field e = (e x , e y ) of unit strength is applied to D and we seek the 
potential U(r) for the computation of cr* in direction e 

a m = I (a(r)VU(r) ■ e) dV r , (2) 

J D 

where a(r) is the local conductivity, dV r is an infinitesimal area element, and 
the unit cell D is [—1/2, 1/2) x [—1/2, 1/2). We make no distinction between 
points or vectors in a real plane ¥L 2 and points in a complex plane C. From 
now on, all points will be denoted z or r. 

The interfaces r in D are given orientation. The restriction of T to D 
is denoted T and the outward unit normal of T at z is n z = n(z). Corner 
vertices are denoted j}.. Obviously, a(z) may jump as Y is crossed. Let cr + (z) 
denote the conductivity on the positive side of Y at z, let cr_(z) denote the 
conductivity on the negative side, and introduce as in 15 

a(z) = cr + (z) — ct-(z) , z G T , (3) 

b(z) = a + (z) + a^z) , zeY, (4) 

c(z) = a + (z)a^(z) , zGT, (5) 

\(z) = a(z)/b(z), zeY. (6) 
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Figure 1: Left: a cutout of a square array of squares with p = 0.5 and a unit cell with a 
16-panel coarse mesh on Tq. Right: a staggered array of squares with p = 0.4 and a unit cell 
with a 32-panel coarse mesh on Tq. The vertex separation distance is d and the dots indicate 
double-corner concentration points 8k- 



Our domains exhibit similarities, but they also differ in important re- 
spects. Different integral equation reformulations will be used for efficiency. 

2.1. Ordered arrays of squares 

The two ordered arrays are made by placing squares with conductivity 
o"2 in a plane with conductivity <j\. Fig. [1] shows cutouts of unit cells. The 
orientation of T is positive. The arrays are overall isotropic, so a* is indepen- 
dent of e. The points in between neighboring corner vertices in the staggered 
array are called double- corner concentration points and denoted 5^. 

The conductivity o~2 may be complex valued while o~\ is assumed real. 
The special case of real valued and negative ratios 02/01 poses a particular 
challenge. The electrostatic equation may not have a unique solution and 
this property is then carried over to the integral equation. Sometimes cr*, 
viewed as a function of a 2 with o~\ held constant, has a well defined limit 
which depends on whether 02/ 'o~\ approaches the negative real axis from 



above or from below in the complex plane. Hetherington and Thorpe [19 
argue that such a branch cut occurs for 0* of composites with right-angled 
interfaces whenever 02/01 G [—3, —1/3]. See also p. 378 of Milton j25|. We 
shall capture the limit of 0* from above. 

We follow standard practice for inclusion problems and represent U(z) 
as a continuous function which is a sum of a driving term and a single-layer 
potential with density p(z) [10] . Enforcing continuity of the normal current 
across T we arrive at the integral equation 

p(z) + M [ p{T) %fWl±\ = 2\(z)Sl{en z } , zeT , (7) 



7T Vr 1 T — z 
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Figure 2: Left: a cutout of a two-component ordered checkerboard. Middle and right: A unit 
cell Do of a random checkerboard with jV sq = 16 squares. Orientation of To (solid lines) and 
Lq \ T\ (dashed lines) along with a point z 6 Ti and its periodic image z pcr £ L \ F\. 



where the 'bar' symbol denotes complex conjugation. We observe that (J7J) 
is a Fredholm integral equation of the second kind with an integral operator 
which is compact away from the corner vertices. 

The parameter X(z) in ((?]) is independent of z. Should the integral op- 
erator in ([7]) have been compact everywhere, then, in a finite portion of the 
complex A-plane, there could exist a finite number of values |A| > 1, called 
eigenvalues of the equation, for which the solution p[z) may not be unique 



or may not even exist as a limit. See Sections 8 and 38 of Mikhlin [24 
If, however, ([7]) can be solved for p{z) and under the assumption that the 
inclusions do not overlap the unit cell boundary, the effective conductivity 
can be computed from 

0* = 0i + 0i / p(z)$i {ez} d\z\ . (8) 

Depending on how the unit cell is chosen, the squares in the staggered 
array may overlap the unit cell boundary. With the choice in Fig. [TJ they 
certainly do. But since the layer density p(z) is periodic and identical on all 
squares one can modify (E]) so that it integrates p(z) twice on the square at 
the center of the unit cell and ignores p(z) on the other squares. 

2.2. Checkerboards 

Fig. |2] shows checkerboards. The squares in D have either high conduc- 
tivity 2 or low conductivity a\. Here cr 2 and a x are real so that o^/ "! > 1- 



7 



Figure 3: Left: a coarse mesh on Tq for a checkerboard unit cell with N sq = 16. There are 
four quadrature panels on each square side. Right: local meshes A4b and Ai c centered around 
a corner vertex. There will be 192 discretization points on A4b and 128 points on A4 C . 

The challenge is to achieve linear complexity and high accuracy in difficult 
situations. 

The middle image of Fig. [2] is from a random checkerboard with N sq = 16. 
The right image indicates T by solid lines. The boundary of -Do is denoted 
L and Ti = T H L . Note that some or all squares that meet at a corner 
vertex 7^ could have the same conductivity. Vertices where two squares of 
conductivity o\ and two squares of conductivity o~i meet diagonally, like in 
the left image of Fig. [2} will be referred to as special corner vertices. 

An efficient Fredholm second kind integral equation for checkerboard 
problems can be derived by applying Green's third identity to the periodic 
function U(z) — ^{ez}. In terms of a transformed potential u(z), this double- 
layer type equation assumes the simple form 

«(z)-^/«(r)s( — } = 2&{e1 W}, z € T , (9) 

where Iq{z) is zero for z G T \ Ti and equal to the vector difference of z 
and its periodic image z per E L \ T% for z G Ti, see Section 2.2 of [l5j]. The 
effective conductivity can be computed from 

a, = / u{z)Q{edz} . (10) 

We observe that the integral operator in is compact away from the corner 
vertices. 
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3. Discretization 



We discretize (J7J) and ()9]) using a Nystrom scheme based on compos- 
ite polynomial interpolatory quadrature and a parameterization z(t) of r . 
Coarse meshes with four quadrature panels per square side are constructed 
on T , see Figs. [T] and [3J We also need fine meshes obtained from coarse 
meshes by subdividing panels neighboring corner vertices n su b times in a 
direction towards the vertices. 

The layer densities p(z) and u(z) in (J7]) and fl9]) are smooth on most 
quadrature panels. We choose quadrature nodes and weights according to 
composite 16-point Gauss-Legendre quadrature in parameter t on such pan- 
els. This quadrature has panelwise polynomial degree 31. 

Panels neighboring corner vertices of ordered arrays of squares or special 
corner vertices of checkerboards require special attention and will be referred 
to as special panels. The layer densities p(z) and u(z) may undergo rapid 
changes there. This is so because of strong singularities that arise in VU(z). 
For checkerboards, as cr 2 /o"i — > oo, this field is barely square integrable in D 



and barely absolutely integrable on r , see Section 2.3 of |15|. See p. 378 



of Milton |25j for a discussion of similar singularities that arise at corner 
vertices as o^/^i approaches values in the range [—3, —1/3]. 

Legendre nodes are not optimal for capturing the behavior of layer den- 
sities on special panels. Rather, it pays off to bunch quadrature nodes in 
a direction towards the vertices. An experimental investigation, see Sec- 
tion I8.2[ shows that nodes corresponding to zeros of the Jacobi polynomial 
P(x)^3 on the canonical interval x G [—1, 1], for certain a and (3, are more 
efficient. For checkerboards and with the corner vertex at a special panel's 
right endpoint, we take a = (ctl/o^) ' 4 — 1 and (3 = 0. With the corner vertex 
at a special panel's left endpoint, we take a = and (3 = (ci/o^) ' 4 — 1- For 
ordered arrays of squares at negative contrast ratios we take a = 1CT 6 — 1 
and (3 = or a = and /3 = 10~ 6 — 1. The corresponding quadrature weights 
are determined so that the panelwise polynomial degree is 15. 

A discretization in parameter t on the coarse mesh of a checkerboard gives 
N coa = 128iV S q points Zi = z{U) and the same number of weights Wi. On the 
fine mesh there are N^ n = (128 + 64n su b)A^ sq discretization points. The square 
array of squares has iV coa = 256 and iVfi n = 256 + 64n su b. The staggered array 
of squares has iV coa = 512 and iV fin = 512 + 128n sub . We collect quadrature 
weights on the diagonal of matrices W for later use. The subscripts 'coa' and 
'fin' are used to indicate the coarse mesh and the refined mesh, respectively. 
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4. Short-range preconditioning 



Consider now the Fredholm second kind integral equations and (j9]) in 
the general form (pQ) where g(z) is piecewise smooth. Let K(t, z) denote the 
kernel of K. Split K(t, z) into two functions 

K(t,z)=K*(t,z) + K°(t,z), (11) 

where K*(t, z) is zero except for when r and z simultaneously lie in a neigh- 
borhood T£ centered around a particular 7^ or 5k- Then K°(r, z) is zero. 
The neighborhoods Y* k cover four coarse panels around 7^ of a square array 
of squares and eight coarse panels around 5k of a staggered array of squares 



and around 7^ of a checkerboard. Compare Section 3.2 of |15l . 

The kernel split (fTTj) corresponds to an operator split K = K* + K° where 
K° is a compact operator. After discretization ([T]) assumes the form 

(I + K* + KV = g> (12) 

where I, K*, and K° are square matrices and /x and g are columns vectors. 
Note that K* is sparse and block diagonal. The blocks of K* oa corresponding 
to 7^ of a square array of squares have size 64 x 64 while the blocks corre- 
sponding to 5k of a staggered array of squares and to 7^ of a checkerboard 
have size 128 x 128. 

The change of variables 

fx(z) = (I + K*)- 1 m (13) 

makes (1T21 read 

(l + K (I + K*)- 1 )/i = g. (14) 

This right preconditioned equation corresponds to the discretization of a 
Fredholm second kind equation with a composed compact operator and the 
solution /2 is the discretization of a piecewise smooth function. There should 
be no ill-conditioning in fll4p due to mesh refinement close to corner vertices 
and we can view (I + K*) _1 as a short-range preconditioner for (I12p . There 
will, however, be ill-conditioning in f[T4"|) for parameter values X(z) that are 
very close to eigenvalues of ([7]) and Q). 
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5. Compression of the preconditioned equation 

The matrix K° and the right hand side g in (I14p can be accurately evalu- 
ated on a grid on the coarse mesh. Only (I + K*) -1 needs a grid on the refined 
mesh for its accurate evaluation. We introduce the compressed weighted in- 
verse 

R = P^(I fin + K^ n )- 1 P. (15) 
Here P is a prolongation operator from the coarse grid to the fine grid , 



Pw = Wfi n PW^a is a weighted prolongation operator, see Section 5 of |13 
Furthermore, the block- diagonal iV coa x N coa matrix P^P, where superscript 
T denotes the transpose, has the property 

P5rP = I. (16) 

Strictly speaking, the relation (|T6|) does not hold exactly for matrix blocks 
corresponding to special panels. It holds, however, also for these blocks that 

SWcoaP^Pfj = f,W coa f} , (17) 

where fj and fj are discretizations of piecewise polynomials on the coarse grid 
of degree i and j and % + j < 15. One can say that ffTB]) holds to the same 
polynomial degree as the overall quadrature holds. 
With (115]) . equation (fl4~]) assumes the form 

(Icoa + K c ° oa R)A coa = gcoa. (18) 

The single-layer equation ((7j) will be used in this form in the numerical ex- 
amples of Section [HJ In terms of the new discrete density fi coa = R/i coa one 
can also write ( ITS]) in left preconditioned form 

(I coa + RK° oa ) /x coa = Rg coa . (19) 

The double-layer equation will be used in this form in the more elaborate 
scheme for complicated unit cells developed in Section [7] 
Functionals on fi(z) of the type 



f(z)fi(z) dz = j f(z(t))fi(z(t)) z'(t) dt , (20) 
where f(z) is a piecewise smooth function, assume the discretized form 

fcoa^coaAcoa i (21) 

where f is a column vector and Z is a matrix containing discrete values 
z' i = z r (ti) multiplied with weights w,- t on the diagonal. 
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Figure 4: Local meshes close to corner vertices. Left: the square array. Right: the staggered 
array with meshes centered around a double-corner concentration point 8k- The vertex sepa- 
ration distance is d. Meshes with index j = n re c nave the largest panels and M nc coincides 
with the coarse mesh on Tq in a neighborhood of Sk, see the rightmost image of Fig. [TJ The 
panels on meshes with index i— 1 are half the size of those on meshes with index i. 

6. Recursive construction of R 

The compressed inverse R has the same block diagonal structure as K* oa , 
see Section |H Its construction from the definition (II 5p is costly when the 
refined mesh has many panels. Actually, the number of subdivisions needed 
to reach a given accuracy may grow without bounds due to the singularities 
in that arise as <Jij(J\ approaches certain values, see Section [3j 

Fortunately, the construction of each block R& of R, associated with 
a corner vertex 7^ or with a double-corner concentration point 5k, can be 
greatly sped up and also stabilized via a recursion. This recursion uses grids 
on local meshes centered around a 7^ or a 4, see Figs. Eland HI 

6.1. General recursion 

The staggered array of squares needs the recursion in the general form 

R fc = V T Whc (F{R£i 1)fc } + II + K° bfe ) _1 P bc , % = 1, . . . , n rec , (22) 

where the number of recursion steps n rec corresponds to rw> of the refined 
mesh and where R^ = R& for i = n Tec . See Section 6 of [13] . The weighted 
and unweighted prolongation operators Pwbc and Pt> c act from a 128-point 
grid on a local mesh M. ic to a 192-point grid on a local mesh M.^ see the 
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right image of Fig. |H The superscript V in (j22p has a meaning which can 
be explained by considering the discretization of K on a 192-point grid on 
M.ih and on a 128-point grid on M.(i-i) c - Let the resulting matrices be Kjbfc 
and K(j_!) c fc. Now K° bfc is the 192 x 192 matrix which results from zeroing 
all entries of Kjbfc that also are contained in the 128 x 128 matrix K^iv/.. 
The operator F{-} expands an 128 x 128 matrix into an 192 x 192 matrix by 
zero-padding in such a way that F{K(j_!) cfc } + K° bfc = Kj b /c- 

6.2. Fixed-point iteration and Newton's method 

The recursion (|22|) can be simplified for square arrays of squares and for 
checkerboards thanks to scale invariance of the integrals in ([7]) and (j9]). The 
local meshes Aiib and hAi C look the same at all recursion steps and the index 
i can be dropped, see the right image of Fig. [3] and the left image of Fig. HI 
The recursion (1221) assumes the form of a fixed-point iteration 

R ifc = V T Whc (F{R^ 1)fe } + 1° + K° fc ) ' P bc , % = 1, . . . , n rec , (23) 
which for n rec — > oo can be cast as a non-linear matrix equation 

G(R fc ) = P^ bc A(R fc )P bc - R fc = , (24) 

where 

A(R*)= (F{R^ 1 } + IS + Ky- 1 , (25) 



see Sections 3.2 and 3.3 of [15|]. The non-linear equation (|24p . in turn, can be 
solved for R^ with a variant of Newton's method. Let X be a matrix-valued 
perturbation of R^ and expand G(R&+X) = to first order in X. This gives 
a Sylvester-type matrix equation 

X - P^ bc A(R fe )F{R fc : XR fc 1 }A(R fe )P bc = G(R fc ) (26) 
for the Newton update X. One can use the Matlab built-in function dlyap 



for (j26j) . but GMRES 28] gives a smaller residual and we use that method. 



6. 3. Initialization, number of recursion steps, and homotopy 

The recursion (|2"2"j) . the fixed-point iteration (l2"3"j) . and Newton's method 
for (|2~4"]) need to be initialized and n rec has to be decided in (1221 and in (123]) . 
The three types of domains call for different strategies. 
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Figure 5: Convergence of associated with a jk of an ordered checkerboard with <72/cri 
10 8 . The 500 fixed-point iteration steps in (|23p are followed by 9 Newton steps for (|24p. 



Random checkerboards at high contrast ratios are the easiest to deal with. 
Here we first use the fixed-point iteration (|23|) . initialized with 

F{R ofe 1 } = I c + K cfc , (27) 

where subscript 'c' refers to a discretization on mesh _M C in Fig. [3j Com- 
pare eq. (24) of [15|. The iterations are stopped when either ||G(Rfc)||/||Rfc|| 
is smaller than 10e mac h in Frobenius norm or a number of 500 iterations 
is reached. The final fixed-point iterate is then used as initial guess in 
Newton's method for ( l24"j) . The Newton iterations are stopped when either 
||G(Rfe)||/||Rfe|| is smaller than 100e mac h or a maximum number of 15 itera- 
tions is reached. Fig. [5] illustrates this strategy for an R fe associated with a 7^ 
of an ordered checkerboard at a^/ci = 10 8 . One can see that the convergence 
of the fixed-point iteration is very slow. About 2 ■ 10 5 steps, corresponding to 
the same number of subdivisions of the fine mesh, would be needed for full 
convergence if only the fixed-point iteration was used. This clearly shows the 
power of Newton iterations and explains why methods relying solely on mesh 
refinement run into great difficulties on these types of domains. There are 
only 16 possible corner configurations in a random checkerboard, correspond- 
ing to 16 distinct blocks of R. Therefore, the time- and storage requirements 
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for computing R are negligible for large unit cells. 

Square arrays of squares at negative contrast ratios are more difficult to 
treat. This is so since the solution to the electrostatic equation may only exist 
as a limit for 02/01 approaching the negative real axis, see Section [27TI Again 
we first use the fixed-point iteration (1231) . initialized as in ( |2"7|) but with 02 
(which enters into K£ fc ) multiplied with a constant q — 1 — O.Oli. Again the 
final fixed-point iterate is used as initial guess in Newton's method for fHM|) . 
Now, however, we use a homotopy method and at each Newton step we 
reduce the imaginary part of q with a factor of ten. After 14 such iterations 
q is set to unity and an additional maximum of 15 Newton iterations are 
performed. In this way the final expression for may be complex valued 
even though the last few matrices K£ fc , fed into (12~4|) . are purely real. 

Staggered arrays of squares at negative contrast ratios are the most intri- 
cate. Here the choice of n rec and of initializer in ( 122]) are very important. We 
choose n Tec large enough so that the vertex separation distance d, see Fig. HI 
at the first recursion step (i = 1) is at least 10 16 times larger than the part 
of T£ covered by the mesh Aiu- In this way the interaction between the two 
connected parts of M.\b is negligible. The initializer R £; is then chosen as a 
compressed inverse computed using the homotopy method just described for 
the Rfc of the square array of squares, neglecting the interaction between the 
connected parts of M.\b- 



7. Long-range preconditioning 

As the number of squares in a unit cell grows, the problem of computing 



0* gets harder, see Section 11.11 This section improves on Section 4 of [15 
and describes a long-range preconditioner for (Q which cures these problems. 
The main idea is to split the unknown layer density into two parts and 
capture all long-range interaction in a matrix S, which can be rapidly inverted 
and used in a right-preconditioner. In combination with the short-range 
preconditioner R of Section [5], applied from the left, this results in a scheme 
whose computational cost for checkerboards with large random unit cells at 
high contrast ratios is almost linear in N sq . 

7.1. An expanded equation 

Each square in Dq has a boundary consisting of four straight segments, 
see Fig. [2j Introduce piecewise constant local basis functions Sk(z), k = 
1, . . . , iV sq , on r U L such that Sk{z) = 1 when z lies on a boundary part 



15 



of square k with positive orientation, s^{z) = — 1 when z lies on a boundary 
part with negative orientation, and s^{z) = otherwise. 



Following Section 4.1 of [15] . we take fl9]) in the general form ([I]) and 
expand it into the system 



iVsq-l . 

{I + K)hq{z) + a M 
fc=i ^ 



Sfe ( z )-A(^)| Sfc (z)| + ^)=^). (28) 



s k (z)fio(z)d\z\=0, k = l,...,N sq -l, (29) 

where ^Iq(z) mimics the rapidly varying behavior of fi(z) and are unknown 
coefficients. 

Discretization of (l2"8"j) and ( I2"9"j) together with left preconditioned compres- 
sion, compare (II 9p . results in the linear system 

(I coa + RK° oa ) /x 0coa + R (Bi - AcoalBxl + A coa u T ) a = Rg coa , (30) 

B^|Z coa |A coa = 0. (31) 

Here Bi is a N coa x (N sq — 1) matrix whose fcth column is the discretization 
of Sk(z), A coa is a column vector whose N COSL entries is the discretization of 
A(z), A coa is matrix containing A coa on the diagonal, u is a column vector 
with N sq — 1 entries all equal to 2/iV sq , a is a column vector containing the 
N sq — 1 coefficients a fc , and vertical bars denote entrywise absolute value. 
The effective conductivity ( ITUj) can be computed from 

°* = $ {Ca Z coa (Aocoa + B l a ) } , (32) 

once ( 13"0"|) and ( IHTj) is solved. Compare (]2"T]). 

7.#. ^4n important simplification 

The definition of Sfc(z) together with Cauchy's integral theorem implies 
that 

eLZcoa B i = 0. (33) 

As a consequence, the second term within parenthesis in ( 13"2"|) does not con- 
tribute to o~ * and can be omitted. 
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The fact that the coefficients a k are not needed in (J32J) opens up for 
another, more important, simplification. With the change of variables 



a k + 



(q(fc)-q(iV sq )) Egr^ 



(34) 



where a{k) denotes the conductivity of square k, the expanded equation f|28 
assumes the simpler form 









Rg 




b 








{I + K)fi {z)+ b k {s k {z) - \{z)\s k {z)\) = g{z) 

k=l 

and ( l30l) reduces to 

(I coa + RK c ° oa ) /i 0coa + R (Bi - AcoalBxl) b = Rg coa . 

7.3. A Schur complement style preconditioner 

The system ( 1361) and ( l3TT) can be written in partitioned form 

I + RK° B 
C 

where subscripts 'coa' are omitted and 

B = R(Bi — A coa |Bi|) , 
C = B 1 |Z coa | . 
The change of variables 

I-BS X C BS 1 

s- x c -s- 1 

where the (iV sq — 1) x (A^ sq — 1) matrix S is given by 

S = CB, 

transforms f|3T|) into 

I + RK°(I-BS X C) RK°BS 1 
I 







I 


B " 


-l 






b 




C 







c 











c 









Rg 




c 








From f)42p it is obvious that c = and we can write 
equation for uj: 

(I + RK°(I — BS _1 C)) = Rg . 
The effective conductivity (|3"2|) can be expressed in terms of u) as 
a, = S{eLZcoa(I-BS- 1 C)a;} . 



(35) 
(36) 

(37) 

(38) 
(39) 

(40) 
(41) 

■ (42) 
as a single 

(43) 



(44) 
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7.4- The inverse of S 

The matrix S of (|4ip is sparse. In the limit X(z) — > it approaches a 
standard five-point stencil for the discrete Laplace operator. As we soon 
shall see, Matlab's sparse banded solver, obtained using 'backslash', is very 
efficient at solving linear systems with S as system matrix, at least for system 
sizes up to N sq = 1.6- 10 6 . We shall use that method in all numerical examples. 

The condition number of S seems to be lower when a(N sq ) = oi than 
when a(N sq ) = o\. Therefore, in our numerical examples, we permute the 
unit cell so that <r(N sq ) = <j 2 . 

7. 5. Reduction in the number of unknowns 

Some entries of uj in (j4"5]l are easy to solve for. To see this, let r eq be the 
part of T that lies between squares of equal conductivity. From (Q it follows 
that X(z) = for z G T cq . This means that all entries of K° and R whose first 
index corresponds to a discretization point ^ e r cq are zero except for the 
diagonal entries Ra which are one. From (j4~3l) we get the simple entry wise 
relation 

= 9i, Zi e r eq . (45) 

Furthermore, the vast majority of these elements are zero thanks to lo(z), 
see the right hand side of (jUJ). 

Eq. (1451) can be used to reduce the number of unknowns in (f43|) . The 
savings are huge when the area fraction p is high or low. For simplicity, we 
only remove the known entries of uj which are zero. The reduced system 
assumes the form 

(jl+iRjK^I-BS^C))^ =|Rg. (46) 

Here uj u are the remaining entries of uj, 'downarrow' indicates that rows of a 
matrix are deleted, and 'rightarrow' indicates that columns are deleted. One 
can see in (PIS"]) that the reduction in the number of unknowns does not induce 
a similar reduction in the size of K°. No columns are deleted. Therefore, the 
speedup resulting from (|46j) is not as great as the savings in storage. 

8. Numerical examples 

This section investigates the complexity and the achievable accuracy of 
our scheme (TTSi) for ordered arrays of squares and fT45|) and (HBT) for random 
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checkerboards. We also compare with recent numerical results |6j obtained 
with the finite element solver Abacus. 

The numerical examples are performed in the Matlab environment (ver- 



sion 7.9). The GMRES iterative solver |28| and a threaded version of the 
fast multipole method coded in C with SIMD instructions, is used for 
the main linear systems. The stopping criterion threshold is set to machine 
epsilon (e mac h). See Section 4.1 of 11] and Section 3 of [i| for how to impose 
periodic boundary conditions on potential fields due to charges in a unit cell. 
The examples involving N sq up to around 10 6 are executed on a workstation 
equipped with an IntelXeon E5430 CPU at 2.66 GHz and 32 GB of memory 
while all other examples are executed on a workstation equipped with an 
IntelCore2 Duo E8400 CPU at 3.00 GHz and 4 GB of memory. 

When estimating accuracy we rely on some exact relations available for 



two-component media and compiled in Chapters 3.2 and 8.7 of Milton [25 
Let us consider a* and the effective conductivity tensor cr* as functions of 
o\ and cr 2 . Then, using a duality transform and the homogeneity of cr*, one 
can show the following relation between an original material and that of a 
material where the components have been interchanged 

<r*(cr 2 , cti) = o-iovjc*^ det(<r*(cri, ct 2 )) . (47) 

Another useful relation which holds for overall isotropic materials is 

0-*(0-i,er 2 )o-*(l/o-i, l/a 2 ) = 1 • (48) 

An ordered checkerboard has 

(?* = Vo\(72 (49) 
and a square array of squares at p = 0.25 has 



VVi + Sa 2 )/(3a 1 + a 2 ) . (50) 

We also observe, see Chapter 1.7 of j25|, that the effective conductivity of a 
random checkerboard at p = 0.5 obeys 

lim cx* = Jo\(j 2 • (51) 

iV sq ->oo 

Note that the tensor cr* has four elements and that they all can be com- 
puted via fT43|) (or f )46|) ) and (jHJ). For example, choosing e = l both in fj4"3]) . 
where e appears in g, and in (jHJ) makes a* assume the value of cr* xx . Choos- 
ing e = 1 in fl4"3"]) and e = i in (|4"4"|) makes a* assume the value of a* yx . 
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Figure 6: Unit cells of two random checkerboards used in the numerical examples. The area 
fraction of squares with conductivity is 02 is p = 0.5. Left: N sq = 10 4 . Right: N sq = 10 6 . 
Please zoom to see the details of the microstructure in the right image. 

8.1. Timing and convergence to statistical limit 

A sequence of 105 random checkerboards is constructed with unit cell 
sizes ranging from iV sq = 4 to N sq = 1615441. All unit cells have o~2/o"i = 10 6 
and p = 0.5, see Fig. [6] for two layouts. The effective conductivities a^ yy of 
the checkerboards are computed via (j4"6l) and (|44p . 

Almost all computing time is spent in the GMRES solver. The setup 
time for S of ( 14T1) at N sq = 10 6 , for example, is only about 0.4% of the total 
computing time. The number of iterations needed for full convergence is 
bounded by 18 and the left image of Fig. [7J shows that the time spent in 
GMRES grows approximately linearly with N sq , reflecting the complexity of 
the fast multipole method. The total time spent applying the inverse of S, 
which is included in the time spent in GMRES, is also shown separately in 
the left image of Fig. [7J One can see that while this time grows faster than 
linearly, it is still less than 9% of the total computing time at iV sq = 10 6 . 

The right image of Fig. [7J shows the actual values for the effective conduc- 
tivities a* of the checkerboards, presented in terms of their relative deviation 
from the statistical limit ( IBTl) . At N sq = 1615441, which is the largest unit 
cell we can handle due to memory constraints, the deviation is about 1%. 
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Figure 7: Solving for the effective conductivity of random checkerboards with (T2/&1 = 10 6 
and unit cells of increasing sizes. The area fraction of squares with conductivity 02 is p = 0.5. 



8.2. Achievable accuracy 

Fig. [BJ left image, illustrates how the placement of quadrature nodes 
influences the achievable accuracy for progressively higher contrast ratios. 
The unit cell is that of an ordered checkerboard with N sq = 4. The circles 
show that the relative error in a* grows roughly as (o^/Vi)*" 5 when Legendre 
nodes are used on all panels. This was the strategy in [15j. The stars show 
that the growth rate becomes linear in o^/ci when Jacobi nodes are used 
on special panels. This is the strategy of the present paper, see Section El 
Several extra digits can be obtained at high contrast ratios. 

Note that for 02/ci > 10 16 , accurate results are impossible in double pre- 
cision arithmetic. This is so since X(z) of (J7|) and (Q is then indistinguishable 
from unity. The integral equations become independent of 02 while the ref- 
erence solution (149 p is not. The error growth rate produced by the Jacobi 
nodes in Fig. [H] could therefore be thought of as optimal. 

Three sequences of checkerboards are now constructed with unit cell sizes 
ranging from iV sq = 4 to iV sq = 1468944 and with p = 0.5. The first sequence 
consists of random checkerboards with 02/01 = 10 6 . The relative errors in 
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Figure 8: Achievable accuracy in the effective conductivity a*. Left: Two strategies for placing 
nodes on special panels are compared. Errors outside of the range [e mac h, 1] are shown as either 
fmach or 1, whichever is closest. Right: error growth as a function of unit cell size. 



their computed effective conductivities are estimated via ( H7|) as 

1 1 cr* (0-2,01) - 0i0 2 0% t (0i, 2 )/ det(cr*(0 1 ,0 2 ))||2 /coN 

n — 1 \\\ • ^ 2 > 

||ct*(0 2 ,0i)||2 

The second sequence consists of ordered checkerboards with 02/ci = 10 6 
and is used as reference solution. The third sequence is the same as the 
second sequence, but the contrast ratio is increased to 0- 2 /°"i = lO 8 . 

Fig. El right image, shows the results and it has several interesting fea- 
tures. For example, one can see that: 

• the error in a* seems to be independent of the unit cell size. This is so 
because the total error is dominated by the error caused by corner self- 
interaction, computed in local coordinates. The error from long-range 
interaction is comparatively small except for N sq > 5 • 10 5 . 

• the error estimate for cr* of random checkerboards (|52|) . based on dual- 
ity, agrees well with the error estimate for 0* of ordered checkerboards, 
based on an exact answer fl49l) . 
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Figure 9: Random checkerboards with iV sq = 90000 squares in the unit cell and a^/ci = 10 s - 
The area fraction of squares with conductivity 02 varies from p = to p = 1. 



8.3. Continuum percolation 

In theoretical materials science it is of interest to study the effective 
conductivities of continuum two-component random composites as p varies. 
Fig. [9] shows such a study for a unit cell with N sq = 90000 and 02/01 = 10 8 
along with the error estimate ( 1521) . Two sequences of realizations are shown 
- one based on sequential random addition and one where all realizations are 
independent. It is obvious, from the jagged shape of the curve in the left 
image and also from the results in Section 18. 1[ that we are far from the sta- 
tistical limit. Percolation thresholds are visible at p ~ 0.41 and at p ~ 0.59. 
These numbers are consistent with classic results on site percolation for a 



square lattice [22J. The overall behavior of function of p m Fig. [9] 



is in agreement with the discussion on p. 207 in Milton [25j and also with 



results obtained with a discrete network model [23j but it stands in contrast 
to results obtained with the finite element method in Fig. 2(a) of [6]. There 
only one percolation threshold is observed. 

8.4. The square array of squares 

Fig. [TU1 shows computed values of 0*/0i for the square array of squares at 
p = 0.25 for negative ratios 02/01- The relative error, with (150]) as reference 
solution, is shown in the right image. The error is close to e mac h except for in 
a neighborhood of three points where it is higher: the 'pole' or 'resonance' 
at 02/01 = —3, the 'essential singularity' at 02/01 = — 1, and the 'zero' at 
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Figure 10: Left: the effective conductivity <r*/cri of a square array of squares atp = 0.25. The 
curves are supported by 349 adaptively spaced data points. Right: the relative error with (f50|) 
as reference solution. 



— —1/3. See [27| for an explanation of the significance and physical 
meaning of these terms. Note that at o^/^i — — 1 we have A = ±oo and 
that ([7]) then becomes a first kind equation. Compare also Fig. 2 of (27| . 
which is similar to the left image of our Fig. \TU\ but where some problems 
are encountered along the branch cut a 2 /<T 1 G [—3,-1/3]. 

8. 5. The staggered array of squares 

Staggered arrays of squares at area fractions close to p = 0.5 exhibit rich 
resonant spectra on the negative real axis and pose greater challenges to 
numerics than the example of Section 18.41 More data points are required 
to resolve a*. For modeling purposes it is convenient to describe staggered 
arrays in terms of a parameter do, related to the area fraction p and to the 
vertex separation distance d, see Figs. [U and HI as 

v= jz^w a,id d= ^£^- (53) 

The left image of Fig. [11] for <i = 10~~ 10 shows an oscillatory behavior of 
<j*I<J\ for a 2 /(Ti G [—3,-1/3] and a number of resonances on the negative 
real axis outside of this interval. The right image shows that the relative 
error in these computations, estimated via how well ( HHl) is met, is typically 
on the order of 10 2 e mac h. Close to the eigenvalues of ((TJ), some of which 
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correspond to poles of cr*, the error is of course larger. The largest relative 
error encountered in this example is estimated to 10~ 8 . 

9. Conclusions 

The homogenization of composite materials with large random unit cells 
of squares at extreme material property ratios is a canonical problem in the 



theory of composite materials. It has fascinated researchers for decades 25 
The domains look simple, yet they are intriguing. There are analytical results 
available for special cases, yet numerical solvers run into trouble. Only a 
few years ago, numerical solutions to the type of homogenization problems 
presented in this paper would be considered far out of reach. 



The present work epitomizes and stretches a recent line of research 13 



14j . Il5l . 1171 . Il8j to a new high. We first show how to treat simple unit cells 
with (almost) optimal accuracy using a short-range preconditioner. We then 
show that larger unit cells pose no extra problems when a new long-range 
preconditioner is added. Our algorithm has (almost) linear complexity in 
both execution time and storage requirement. Problems involving unit cells 
with a million of squares can be solved to very high precision in a few hours. 
Homogenization on checkerboard-like domains have become a simple task. 

How useful is our new scheme? The coupling of checkerboard problems 
to real-world physics is elusive. One may question the relevance of the small 
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length-scales needed for the resolution of various singular fields. Neverthe- 
less, a recent surge in physicists' interest in metamaterials has given new 
momentum to the study of these issues 27|. The difficulties arising in ran- 



dom checkerboard problems may, further, be representative of the sort of 
troubles that arise in several real-world problems. Since integral equation 
methods are widely applicable, it is therefore likely that our scheme and gen- 
eralizations thereof have many immediate applications. Further work based 
on the present scheme and directed towards metamaterial applications is in 
progress [16| . 
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